AITopics | Databases

Databases

SADGA: Structure-Aware Dual Graph Aggregation Network for Text-to-SQL

Neural Information Processing Systems, Apr-25-2026, 14:06:06 GMT

The Text-to-SQL task, which aims to translate natural language questions into SQL queries, has drawn much attention recently. One of its most challenging problems is how to generalize a trained model to unseen database schemas, also known as the cross-domain Text-to-SQL task. The key lies in the generalizability of (i) the encoding method used to model the question and the database schema and (ii) the question-schema linking method used to learn the mapping between words in the question and tables/columns in the database schema. Focusing on these two issues, we propose a Structure-Aware Dual Graph Aggregation Network (SADGA) for cross-domain Text-to-SQL. In SADGA, we adopt a graph structure to provide a unified encoding model for both the natural language question and the database schema. Based on this unified modeling, we further devise a structure-aware aggregation method to learn the mapping between the question-graph and the schema-graph. The structure-aware aggregation method features Global Graph Linking, Local Graph Linking, and a Dual-Graph Aggregation Mechanism. We not only study the performance of our proposal empirically but also achieve 3rd place on the challenging Text-to-SQL benchmark Spider at the time of writing.
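To make the question-schema linking problem in the abstract above concrete, the following is a minimal sketch of a naive name-matching baseline. It is not SADGA's graph-based method. The function `link_question_to_schema`, the dictionary layout of `schema`, and the example table and column names are all assumptions made for illustration.

```python
import re


def link_question_to_schema(question, schema):
    """Return (question_word, schema_item, match_type) triples.

    `schema` maps a table name to its column names (hypothetical layout).
    Matching is purely lexical: exact name match or prefix overlap.
    """
    words = re.findall(r"[a-z0-9]+", question.lower())
    links = []
    for table, columns in schema.items():
        # Each table contributes one table item and one item per column.
        items = [(table, "table")] + [(f"{table}.{c}", "column") for c in columns]
        for item, kind in items:
            name = item.split(".")[-1].lower()
            for word in words:
                if word == name:
                    links.append((word, item, f"exact-{kind}"))
                elif len(word) > 3 and (name.startswith(word) or word.startswith(name)):
                    links.append((word, item, f"partial-{kind}"))
    return links


# Illustrative schema, not taken from the Spider benchmark.
schema = {"singer": ["name", "age"], "concert": ["year", "singer_id"]}
links = link_question_to_schema("What is the age of each singer?", schema)
for link in links:
    print(link)
```

A lexical matcher like this cannot link a word to a schema item it does not share characters with, which is one reason learned linking over question and schema structure is of interest for unseen schemas.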

computational linguistic, machine learning, natural language, (18 more...)

Country:

Europe (1.00)
Asia > China > Guangdong Province (0.28)
North America > United States > Minnesota (0.28)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

0c007ebef1d11fd48da6ce4f54687db6-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems, Apr-24-2026, 17:08:48 GMT

large language model, machine learning, question answering, (21 more...)

Country: Asia (0.46)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Health Care Technology (0.69)
Health & Medicine > Nuclear Medicine (0.68)

Technology:

Information Technology > Databases (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Information Management (0.93)
(4 more...)

WikiDBs: A Large-Scale Corpus Of Relational Databases From Wikidata

Neural Information Processing Systems, Mar-20-2026, 06:12:42 GMT

Deep learning on tabular data, and particularly tabular representation learning, has recently gained growing interest. However, representation learning for relational databases with multiple tables is still an underexplored area, which may be attributed to the lack of openly available resources. To support the development of foundation models for tabular data and relational databases, we introduce WikiDBs, a novel open-source corpus of 100,000 relational databases. Each database consists of multiple tables connected by foreign keys. The corpus is based on Wikidata and aims to follow certain characteristics of real-world databases. In this paper, we describe the dataset and our method for creating it. By making our code publicly available, we enable others to create tailored versions of the dataset, for example, by creating databases in different languages. Finally, we conduct a set of initial experiments to showcase how WikiDBs can be used to train models for data engineering tasks, such as missing value imputation and column type annotation.
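As an illustration of what "multiple tables connected by foreign keys" looks like in practice, here is a small sketch using Python's standard `sqlite3` module. The `country` and `city` tables are hypothetical and are not drawn from WikiDBs, and nothing here reflects the WikiDBs file format or tooling. The sketch only shows how foreign-key links can be read back from a schema.

```python
import sqlite3

# In-memory database with two hypothetical tables linked by a foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE city (
        id INTEGER PRIMARY KEY,
        name TEXT,
        country_id INTEGER REFERENCES country(id)
    );
    """
)

# PRAGMA foreign_key_list returns one row per foreign-key column with the
# fields: id, seq, table, from, to, on_update, on_delete, match.
rows = conn.execute("PRAGMA foreign_key_list(city)").fetchall()

# Keep (child table, child column, parent table, parent column).
fk_edges = [("city", row[3], row[2], row[4]) for row in rows]
print(fk_edges)
conn.close()
```

Collecting such tuples for every table yields the table-level link structure that multi-table representation learning would need as input.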

artificial intelligence, database, machine learning, (8 more...)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.41)

RelBench: A Benchmark for Deep Learning on Relational Databases

Neural Information Processing Systems, Mar-19-2026, 04:08:37 GMT

RelBench provides databases and tasks spanning diverse domains, scales, and database dimensions, and is intended to be a foundational infrastructure for future research in this direction. We use RelBench to conduct the first comprehensive empirical study of graph neural network (GNN) based predictive models on relational data, as recently proposed by Fey et al. 2024. End-to-end learned GNNs are capable of fully exploiting the predictive signal encoded in links between entities, marking a significant shift away from the dominant paradigm of manual feature engineering combined with tabular machine learning. To thoroughly evaluate GNNs against the prior gold standard, we conduct a user study in which an experienced data scientist manually engineers features for each task. In this study, GNNs learn better models whilst reducing the human work needed by more than an order of magnitude. This result demonstrates the power of GNNs for solving predictive tasks in relational databases, opening up new research opportunities.
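The idea of exploiting "links between entities" can be sketched in plain Python: rows become nodes, foreign-key references become edges, and one aggregation step passes child-row features to the parent row. This is only a toy illustration under assumed names (`orders`, `build_edges`, `mean_aggregate`); it does not use the RelBench API and is not a trained GNN.

```python
from collections import defaultdict

# Hypothetical child table; each row references a customer by foreign key.
orders = [
    {"id": 10, "customer_id": 1, "amount": 20.0},
    {"id": 11, "customer_id": 1, "amount": 40.0},
    {"id": 12, "customer_id": 2, "amount": 10.0},
]


def build_edges(child_rows, fk_column, child_table, parent_table):
    """One edge per foreign-key reference: (child node, parent node)."""
    return [
        ((child_table, row["id"]), (parent_table, row[fk_column]))
        for row in child_rows
    ]


def mean_aggregate(edges, child_features):
    """Average child features onto each parent node (one aggregation step)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for child, parent in edges:
        sums[parent] += child_features[child]
        counts[parent] += 1
    return {parent: sums[parent] / counts[parent] for parent in sums}


edges = build_edges(orders, "customer_id", "orders", "customers")
amounts = {("orders", row["id"]): row["amount"] for row in orders}
mean_order_amount = mean_aggregate(edges, amounts)
print(mean_order_amount)
```

The mean order amount per customer is the kind of feature a data scientist would otherwise engineer by hand with a join and a group-by; a learned GNN replaces the fixed mean with trainable aggregation over the same edges.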

artificial intelligence, machine learning, proceedings, (6 more...)

Genre: Research Report > New Finding (0.60)

Technology: